New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed #33308 -- Added support for psycopg version 3 #15687
Conversation
Looking through the code base there are quite a few areas where it would probably be easier if we just assumed that if psycopg3 is installed that we want to use it; this might get a bit more fun for testing (extra environment, but realistically speaking we want to be on psycopg3 only in the longrun anways…) |
We are down to three failures :) |
d438338
to
03d5568
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work picking this up @apollo13 . I've had a quick look and made some comments. I hope to find time to test this with a client project test suite in a week or so.
@timgraham I'd love if you could look over this and maybe test your cockroachdb backend against this. It would be great if I don't fully break it :) |
Besides the issue in
The old SQL: SELECT DISTINCT "ordering_author"."id",
"ordering_author"."name",
"ordering_author"."editor_id",
(SELECT Max(U0."pub_date") AS "last_date"
FROM "ordering_article" U0
WHERE ( U0."author_id" = ( "ordering_author"."id" )
AND Upper(U0."headline" :: text) LIKE Upper(
'%Article%') )
GROUP BY U0."author_id") AS "last_date",
(SELECT Max(U0."pub_date") AS "last_date"
FROM "ordering_article" U0
WHERE ( U0."author_id" = ( "ordering_author"."id" )
AND Upper(U0."headline" :: text) LIKE Upper(
'%Article%') )
GROUP BY U0."author_id") IS NULL,
(SELECT Max(U0."pub_date") AS "last_date"
FROM "ordering_article" U0
WHERE ( U0."author_id" = ( "ordering_author"."id" )
AND Upper(U0."headline" :: text) LIKE Upper(
'%Article%') )
GROUP BY U0."author_id")
FROM "ordering_author"
ORDER BY (SELECT Max(U0."pub_date") AS "last_date"
FROM "ordering_article" U0
WHERE ( U0."author_id" = ( "ordering_author"."id" )
AND Upper(U0."headline" :: text) LIKE Upper('%Article%') )
GROUP BY U0."author_id") IS NULL,
(SELECT Max(U0."pub_date") AS "last_date"
FROM "ordering_article" U0
WHERE ( U0."author_id" = ( "ordering_author"."id" )
AND Upper(U0."headline" :: text) LIKE Upper('%Article%') )
GROUP BY U0."author_id") DESC The new SQL: SELECT DISTINCT "ordering_author"."id",
"ordering_author"."name",
"ordering_author"."editor_id",
(SELECT Max(U0."pub_date") AS "last_date"
FROM "ordering_article" U0
WHERE ( U0."author_id" = ( "ordering_author"."id" )
AND Upper(U0."headline" :: text) LIKE Upper(
'%Article%') )
GROUP BY U0."author_id") AS "last_date",
(SELECT Max(U0."pub_date") AS "last_date"
FROM "ordering_article" U0
WHERE ( U0."author_id" = ( "ordering_author"."id" )
AND Upper(U0."headline" :: text) LIKE Upper(
'%Article%') )
GROUP BY U0."author_id") IS NULL,
(SELECT Max(U0."pub_date") AS "last_date"
FROM "ordering_article" U0
WHERE ( U0."author_id" = ( "ordering_author"."id" )
AND Upper(U0."headline" :: text) LIKE Upper(
'%Article%') )
GROUP BY U0."author_id")
FROM "ordering_author"
ORDER BY 5 DESC It might be that because CockroachDB has |
This is probably a result of https://github.com/django/django/pull/15687/files#diff-f58de2deaccecd2d53199c5ca29e3e1050ec2adb80fb057cdfc0b4e6accdf14fR753-R769 but if it looses the second |
@timgraham I can reproduce your query issue when I set |
d074e9d
to
b641d99
Compare
Yes, the failure is only present on older versions of CockroachDB. |
buildbot, test on oracle. |
wow all tests passing! great work |
I tested this branch with different versions of
|
FYI, psycopg 3.1.5 has been just released, including several performance improvements. Also, FYI, I have noticed a relatively large performance impact asking for |
I have installed psycopg 3.1.5 and run the tests. The timing difference has remained almost the same as in psycopg 3.1.4 and the db setup times have increased, but this could be due to the load of my pc at that moment (even if in reality it was idle). I've updated to the results table above. Do you know if there is a way to run only the psycopg backend tests excluding the others? |
86ec886
to
15f4669
Compare
I pushed docs changes. |
9e645d1
to
9701e9a
Compare
The `psycopg2`_ module is required for use as the database adapter when using | ||
GeoDjango with PostGIS. | ||
The `psycopg`_ or `psycopg2`_ module is required for use as the database | ||
adapter when using GeoDjango with PostGIS. | ||
|
||
On Debian/Ubuntu, you are advised to install the following packages: | ||
``postgresql-x.x``, ``postgresql-x.x-postgis``, ``postgresql-server-dev-x.x``, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
``postgresql-x.x``, ``postgresql-x.x-postgis``, ``postgresql-server-dev-x.x``, | |
``postgresql-x``, ``postgresql-x-postgis-3``, ``postgresql-server-dev-x``, |
PostgreSQL 12 is the minimum version supported by Django 4.2. Since PostgreSQL 10 the version number is incremented by 1 every time, so I have updated the package names with the really available ones in the Debian/Ubuntu repositories.
PS. there is no postgresql-x-postgis-2.x
package.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a cleanup in docs that can be backported. Please submit a separate PR with this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I submitted the PR #16385
I'm going to update |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remarkably minimal. Nice 👍
A couple of comments...
I squashed commits and pushed edits to docs. |
I tested this Django branch with The only error is this But it will be a fix to do in Django Debug Toolbar. |
Thanks Simon Charette, Tim Graham, and Adam Johnson for reviews. Co-authored-by: Florian Apolloner <florian@apolloner.eu> Co-authored-by: Mariusz Felisiak <felisiak.mariusz@gmail.com>
sql.quote = _quote | ||
def load(self, data): | ||
res = super().load(data) | ||
return res.replace(tzinfo=self.timezone) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this line correct?
I'm doing something weird - using Django's connection params in a non-Django settings (I know it's not recommended, all warranty is off, regular Django not affected). The program's timezone is different from the DB timezone. This line causes the time to be read incorrectly, because it transforms e.g. 2023-04-09 19:44:32.768813+03:00
to 2023-04-09 19:44:32.768813+00:00
, which is a different time.
I wonder if this is what was really meant, or if this wants to be res.astimezone(tzinfo=self.timezone)
(with special handling of None
)? That would result in 2023-04-09 16:44:32.768813+00:00
.
Or maybe this assumes the DB timezone == app timezone? But in that case, why is this needed at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replacing with return res
or return res.astimezone(self.timezone)
causes many tests to fails, e.g. in queries
or postgres_tests.test_ranges
.
Or maybe this assumes the DB timezone == app timezone? But in that case, why is this needed at all?
As far as I'm aware, this is needed because datetimes are loaded without a timezone so we have to set it manually 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replacing with return res or return res.astimezone(self.timezone) causes many tests to fails, e.g. in queries or postgres_tests.test_ranges.
Thanks for checking, I'll try it myself and investigate more.
If you're OK with trying a little tweak, that would save me some effort setting up Django dev env. The docstring says "The timezone can be None too, in which case it will be chopped.", so I think it should be something like this:
res = super().load(data)
if self.timezone is None:
# USE_TZ=False, convert to system local time zone and make naive.
return res.astimezone(None).replace(tzinfo=None)
else:
# USE_TZ=True, convert to configured timezone.
return res.astimezone(self.timezone)
As far as I'm aware, this is needed because datetimes are loaded without a timezone so we have to set it manually thinking
As far as I can see, psycopg's TimestamptzLoader
(the super class) always returns an aware datetime in the DB (connection) timezone.
Overall, looking at 4.1.x code, I don't see any messing with timezones other than setting the DB connection's timezone, so I'll try to understand why it's needed in psycopg3 in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, looking at 4.1.x code, I don't see any messing with timezones other than setting the DB connection's timezone, so I'll try to understand why it's needed in psycopg3 in the first place.
This is probably required in the Django test suite because we're changing the TIME_ZONE
setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked into it, if anyone's interested (TL;DR - it's fine).
I believe the TimestamptzLoader is meant to replace this code in psycopg2 backend create_cursor
:
def create_cursor(self, name=None):
... snipped ...
cursor.tzinfo_factory = self.tzinfo_factory if settings.USE_TZ else None
return cursor
def tzinfo_factory(self, offset):
return self.timezone
In psycopg2 the tzinfo_factory
line has this effect when loading a timestamptz (see code):
- If
USE_TZ = False
, returns a naive datetime in DB connection timezone - equivalent to reading aware datetime from DB and doingreplace(tzinfo=None)
. - If
USE_TZ = True
, returns an aware datetime in DB connection timezone and replaces its timezone to Django DB timezone - equivalent to doingreplace(tzinfo=self.timezone)
.
i.e. same as TimestamptzLoader
's res.replace(tzinfo=self.timezone)
code.
So this indeed relies on the assumption that Django DB timezone == connection timezone, which Django ensures when setting up a new connection.
I think that once psycopg2 support is dropped, this assumption could be dropped, making things less dangerous. But until then, it's better to keep the psycopg2 and 3 behaviors the same.
What did I do? I took
https://github.com/dvarrazzo/django-psycopg3-backend and blackified it +
ported over most (all?) new commits. I am now opening this on GitHub to
be able to nicely diff and start a discussion about whether we can
support psycopg2 & 3 easiyl from the same codebase (I think we can).
I (i.e. @felixxm) have the following plan to move this forward: